For TY 22, appeals are scored only for bulk no change.

Current Scope

Bulk appeal settlement requires (1) scoring appeals by the likelihood that a reduction is justified and (2) determining the threshold at which appeals will be bulk settled.

For example, imagine two properties in two neighborhoods. Property one is in a very heterogeneous neighborhood where nearby properties vary greatly in value and quality. Property two is in a very homogeneous neighborhood where all properties are very similar. In such a scenario, we would score the reduction likelihood much higher for property one than for property two. Whether (and how) the appeals are bulk settled depends on the score threshold at or above which an appeal would be bulk settled. Depending on the scores and the threshold(s), property one alone, both properties, or neither property may meet the criteria.

Appeal Scoring

Typically, an analyst evaluates each appeal by selecting a handful of comparable properties (comps) which fall within a series of ranges or meet pre-defined criteria. These comps are often nearby properties which are similar in age, size, and construction to the target property. The comps are then used to construct a normalized value estimate, such as price per square foot, to apply to the target property.

The core principle of appeal scoring is to estimate the variance/level of uniformity in assessments. In the current version, a higher score should represent greater confidence in the current valuation and corresponding no change decision. Proposed factors to consider include property level and neighborhood level factors.

Property Level Factors

Comparable properties

  • Count of comparables
  • Comparison to distribution of comparables by building assessed value per building square foot
  • Proportion of standard filter comparables with a building AV per building square foot less than the target property
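
The comparable-based factors listed above can be sketched in a few lines. This is a minimal illustration, not the production logic: the function name, the input shape, and the example values are all hypothetical, and ties with the target are counted as neither above nor below.

```python
from typing import Optional, Sequence


def comp_factors(target_av_psf: float, comp_av_psf: Sequence[float]) -> dict:
    """Summarize a target property against its comparables.

    Inputs are building AV per building square foot. Names are hypothetical.
    """
    n = len(comp_av_psf)
    if n == 0:
        # No comparables: shares are undefined.
        return {"count": 0, "share_below": None, "share_above": None}
    below = sum(1 for v in comp_av_psf if v < target_av_psf)
    above = sum(1 for v in comp_av_psf if v > target_av_psf)
    return {"count": n, "share_below": below / n, "share_above": above / n}


# Made-up values for illustration only.
factors = comp_factors(50.0, [40.0, 45.0, 55.0, 60.0])
```

Here share_below approximates the target's percentile within its comps, and share_above is the quantity that feeds the "share of comps greater than target" points in the scoring section.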

Neighborhood Level Factors

Assessment accuracy

  • Met IAAO Sales Ratio Metrics (COD, PRD, PRB)
  • Calculated on most recent year of sales against reassessments for assessor neighborhoods
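
As a rough sketch of the sales ratio check, COD and PRD can be computed as below. The acceptance ranges are the commonly cited IAAO ranges for residential property and are an assumption about what "met" means here. MIN_SALES is a hypothetical cutoff for "not enough sales". PRB requires a regression and is left out of this sketch. Assessed values are assumed to be on the same basis as sale prices.

```python
from statistics import mean, median

# Commonly cited IAAO ranges for residential property (assumed here).
COD_RANGE = (5.0, 15.0)
PRD_RANGE = (0.98, 1.03)
MIN_SALES = 30  # hypothetical minimum sample size, not from the source


def cod(ratios):
    """Coefficient of dispersion: mean absolute deviation from the
    median ratio, expressed as a percent of the median ratio."""
    med = median(ratios)
    return 100.0 * mean([abs(r - med) for r in ratios]) / med


def prd(assessed, sale_prices):
    """Price-related differential: mean ratio over the weighted mean ratio."""
    ratios = [a / s for a, s in zip(assessed, sale_prices)]
    return mean(ratios) / (sum(assessed) / sum(sale_prices))


def standards_met(assessed, sale_prices, min_sales=MIN_SALES):
    """Count how many of COD and PRD are in range; 0 if too few sales."""
    if len(sale_prices) < min_sales:
        return 0
    ratios = [a / s for a, s in zip(assessed, sale_prices)]
    met = 0
    if COD_RANGE[0] <= cod(ratios) <= COD_RANGE[1]:
        met += 1
    if PRD_RANGE[0] <= prd(assessed, sale_prices) <= PRD_RANGE[1]:
        met += 1
    return met


# Toy data for illustration only.
assessed = [90.0, 100.0, 110.0]
sales = [100.0, 100.0, 100.0]
example_cod = cod([a / s for a, s in zip(assessed, sales)])
example_prd = prd(assessed, sales)
```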

Scoring Methodology

Let’s score an example PIN from 2021 (05273010150000).

Below is the information collected on this property.

Scores are out of 100 as follows:

  • 20 pts, 1st cut no change
  • 30 pts, 2nd cut no change
  • 5 pts, below 50th percentile comp bldg av per sqft
  • 5 pts, below 20th percentile comp (property is valued much lower than comps)
  • 25 pts, share of comps greater than target (percentage times 25)
  • 15 pts, 5 pts per IAAO standard (COD, PRD, PRB) met on NBHD (0 pts if not enough sales)

Note: a change on 1st or 2nd cut leads to a score of 0

Our property here receives a score of 66.6: 50 pts because the 1st and 2nd cuts both had no change, 5 pts because the PRD standard was met, and 11.6 pts based on the share of comps valued greater than the target property (47.5%).
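
The point schedule above can be written as a small function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, a change on either prior cut is taken to zero the whole score, a cut that has not occurred is passed as None and earns no points, and each IAAO standard met is worth 5 pts. The target percentile of 0.525 in the example is also an assumption (one minus the 47.5% share, ignoring ties).

```python
from typing import Optional


def appeal_score(first_cut_no_change: Optional[bool],
                 second_cut_no_change: Optional[bool],
                 target_percentile: float,
                 share_comps_above: float,
                 iaao_standards_met: int) -> float:
    """Score an appeal out of 100; higher means more confidence in no change."""
    # Assumption: a change on either prior cut zeroes the whole score.
    if first_cut_no_change is False or second_cut_no_change is False:
        return 0.0
    score = 0.0
    if first_cut_no_change:
        score += 20   # 1st cut no change
    if second_cut_no_change:
        score += 30   # 2nd cut no change
    if target_percentile < 0.50:
        score += 5    # below 50th percentile of comps
    if target_percentile < 0.20:
        score += 5    # below 20th percentile of comps
    score += 25 * share_comps_above   # share of comps greater than target
    score += 5 * iaao_standards_met   # 5 pts per standard, up to 15
    return score


# Inputs taken from the worked example; the percentile is assumed.
example_score = appeal_score(True, True, 0.525, 0.475, 1)
```

With these inputs the function returns about 66.9 rather than the 66.6 reported above, because 47.5% of 25 is 11.875, not 11.6. Treat the sketch as an illustration of the schedule rather than an exact reproduction of the example.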

Bulk Threshold

Scoring the level of uniformity in appeals creates an ordered list of appeals by likelihood of non-uniformity. These scores can then be used in workflows through categorization (high, medium, or low variance appeal) or to issue bulk determinations based on thresholds.

The core principle of threshold setting is that some areas of Cook County are uniform enough that reductions cannot be justified except in extreme circumstances. While there will always be some level of variability in assessments, it would not make sense to grant a reduction of 1,000 against a property with an assessment of 100,000 (a 1% change), or any reduction below a certain percentage of a property’s value (say 5%). A property whose potential reduction falls below this threshold would only be justified in receiving a reduction if extenuating circumstances exist, such as very low building condition or some degree of uninhabitability.
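
The minimum-reduction idea reduces to a one-line check. The sketch below uses the 5% figure from the example above as a default. The function name is hypothetical and the cutoff is illustrative, not an adopted policy value.

```python
def is_material_reduction(current_av: float, proposed_av: float,
                          min_pct: float = 0.05) -> bool:
    """Return True if the reduction is at least min_pct of the current AV."""
    reduction = current_av - proposed_av
    return reduction / current_av >= min_pct


# A reduction of 1,000 on 100,000 is 1%, below the illustrative 5% cutoff.
small = is_material_reduction(100_000, 99_000)
# A reduction of 6,000 on 100,000 is 6%, which clears it.
large = is_material_reduction(100_000, 94_000)
```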

Another aspect of thresholds is that they can be set at different levels by classes, townships/tris, or cuts. It may make sense to set the most generous threshold at 1st cut for a heterogeneous township and the least generous threshold for a very homogeneous township at 3rd cut.

Tax Year 2021 Simulation

The simulation was run on regression class residential property (classes 202-212, 234, and 278). Approximately 1.04 million parcels (55% of the total) received assessments in this category, and 196,000 appeals were filed (66% of all appeals, counted by unique 10-digit PIN).

Below is a table which presents the results of the simulation for the appeals. Note that, at this time, the simulation assumes that all appeals are 1st cut. This is the most difficult scenario and would not be recommended as the first course of action.

To determine a threshold, we want to look at the separability of the two classes as defined by the actual BOR decision from 2021 (change or no change). Below is a series of plots which show the distribution of each class across various metrics.

Setting a Threshold

As discussed, in the current version of bulk appeals, a threshold is a decision boundary beyond which appeals could be automatically adjudicated no change. The current methodology is a rudimentary classification algorithm, so it can be evaluated with standard classification techniques.

Evaluation Metrics

Confusion Matrix

A confusion matrix shows the classified result versus the truth. For our situation:

  • A true positive is an appeal with result no change that was classified no change (top left)
  • A true negative is an appeal with result decrease that was classified decrease (bottom right)
  • A false positive is an appeal with result decrease that was classified no change (top right)
  • A false negative is an appeal with result no change that was classified decrease (bottom left)
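
The four cells defined above can be counted directly from scores and actual outcomes. This is a minimal sketch: the function name and toy data are hypothetical, and it assumes an appeal is classified no change when its score is at or above the threshold.

```python
def confusion_counts(scores, actual_no_change, threshold):
    """Count TP/FP/FN/TN, treating 'no change' as the positive class."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for score, no_change in zip(scores, actual_no_change):
        # Assumption: at or above the threshold means predicted no change.
        predicted_no_change = score >= threshold
        if predicted_no_change and no_change:
            counts["TP"] += 1   # classified no change, result no change
        elif predicted_no_change and not no_change:
            counts["FP"] += 1   # classified no change, result decrease
        elif not predicted_no_change and no_change:
            counts["FN"] += 1   # classified decrease, result no change
        else:
            counts["TN"] += 1   # classified decrease, result decrease
    return counts


# Toy data for illustration only.
toy_counts = confusion_counts([10, 30, 40, 20],
                              [False, True, False, True],
                              threshold=25)
```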

Let’s demonstrate this with an example threshold of 25.

We can see that we have a lot of mistaken predictions at this threshold. This is in part because our heuristic cannot adequately distinguish between the two classes with the limited information it has. Let’s analyze the grid above again.

(Prediction - Truth)

Situation 1: No Change - No Change 74249 40%

This is the ideal situation where an appeal would be ‘correctly’ classified as no change just as an analyst would.

Situation 2: No Change - Decrease 18852 10%

This is the worst case scenario where an appeal would be ‘incorrectly’ classified as no change even though an analyst would provide a decrease. Our goal by threshold setting is to avoid this.

Situation 3: Predicted Decrease 91982 50%

This is the situation where an analyst would evaluate the appeal. In most cases in this category, the analyst found a decrease warranted. A secondary goal is to minimize the number of false negatives, where an appeal was predicted decrease but an analyst found no change.
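
As a quick check on the three situations at threshold 25, the shares and the precision of the no change classification follow from the counts alone. The variable names are hypothetical, and the total is simply the sum of the three counts listed above.

```python
# Counts reported above for a threshold of 25.
true_no_change = 74249      # Situation 1: predicted no change, actual no change
false_no_change = 18852     # Situation 2: predicted no change, actual decrease
predicted_decrease = 91982  # Situation 3: sent to an analyst

total = true_no_change + false_no_change + predicted_decrease

shares = {
    "situation_1": true_no_change / total,
    "situation_2": false_no_change / total,
    "situation_3": predicted_decrease / total,
}

# Of the appeals that would be bulk settled no change, the fraction
# that an analyst also found no change.
precision = true_no_change / (true_no_change + false_no_change)
```

On these figures, roughly 80% of the bulk no change determinations would match the analyst's decision, and roughly 20% would not.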

For example, let’s set the threshold at 35 instead.

We can see that the number of ‘bad’ false positives has decreased, but the number of appeals to be manually evaluated (false negatives plus true negatives) has increased.
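
The tradeoff between the two thresholds can be explored by sweeping candidate values. The sketch below uses made-up toy data, not the 2021 simulation, and again assumes scores at or above the threshold are classified no change. The function name is hypothetical.

```python
def sweep_thresholds(scores, actual_no_change, thresholds):
    """For each threshold, report false positives and manual review volume."""
    rows = []
    for t in thresholds:
        false_positives = 0
        manual_reviews = 0
        for score, no_change in zip(scores, actual_no_change):
            if score >= t:
                # Bulk settled no change; wrong if the analyst found a decrease.
                if not no_change:
                    false_positives += 1
            else:
                # Predicted decrease: goes to an analyst (FN plus TN).
                manual_reviews += 1
        rows.append({"threshold": t,
                     "false_positives": false_positives,
                     "manual_reviews": manual_reviews})
    return rows


# Toy data for illustration only.
toy_scores = [10, 20, 30, 30, 40, 45]
toy_actual = [False, True, False, True, False, True]
sweep = sweep_thresholds(toy_scores, toy_actual, [25, 35])
```

On the toy data, raising the threshold from 25 to 35 halves the false positives while doubling the manual review count, which is the same direction of movement described above.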

Other Notes

Analyst Evaluation

Scores and the threshold value will be easily available to analysts. An accessible feedback form will allow analysts to flag mistaken scores or threshold values.

Making Automated Valuation Determinations

Generating change valuation determinations at this time would be non-uniform since it is mid-cycle. This methodology will be established over the coming months.

Other Appeal Criteria

Other criteria, such as sales in the past three years, can also be implemented if specified.